PDF Workflow

A PDF workflow, in essence, is one that does not contain any Connect template or Design document and only uses PDF files as data files. The idea is that a PDF file, because it is a formatted document in and of itself, doesn't need to go through a merge process before it can be printed.

PlanetPress Workflow provides a few tasks specifically designed to work with PDFs:

In most cases, this kind of workflow also implies the use of Metadata tasks (see Metadata Tasks).
You can use Metadata tasks to group, sort and sequence (split) the PDF data. The Create PDF task will apply the active Metadata to the PDF data file before creating the PDF output.

Things to keep in mind while working with Metadata are set forth in another topic: Working with Metadata.

In Connect it is also possible to group, sort and split PDF data using OL Connect tasks.

Example: Daily sales report from PDF files

This workflow makes heavy use of PDF tasks and Metadata, and assumes that you are using PlanetPress Workflow version 7.3 or higher.

This single-process workflow generates a daily sales report for every sales representative in a company who made at least one sale. It captures the invoices generated on a specific day, combines all the invoices for each sales representative into a single PDF, and then sends that PDF to the sales representative. To do so, it uses several Metadata tasks as well as a quick lookup in an external Excel spreadsheet.

Resources

Task Breakdown

  • The initial input is the Merge PDF Files task, which retrieves and merges all the PDF files in the specified folder. Once a single PDF is created, the task also optimizes the PDF (to avoid duplicating images and font definitions for each page) and generates a basic Metadata structure containing a single document with one Data Page per captured PDF.
  • The Metadata Level Creation creates the Document level of the Metadata by placing each PDF data file in its own Document level. It does this by detecting when the Address in the document changes.
  • Then, the Metadata Fields Management adds a few fields at the Document level in order to properly tag each document with the appropriate information, in this case the Customer ID, Country and Rep ID. These fields are used for the following Metadata tasks.
  • The Metadata Filter follows by removing any invoice whose Country is not the US. Note that the Metadata Filter is an *inclusive* filter, meaning that it keeps the parts of the Metadata for which the filter condition is true, and filters out everything else.
  • The Metadata Sorter then re-orders the Metadata documents by Rep ID, so that all of the invoices for any particular sales representative are grouped together.
  • Lookup in Microsoft® Excel® Documents then uses the Rep ID field to retrieve each sales representative's email from a specific Excel spreadsheet.
  • The Metadata Sequencer acts like a splitter, where the separation happens whenever the Rep ID changes. Since the documents are sorted by that field, each sequence can contain one or more documents, but they will all be for the same Rep ID.
  • Create PDF is then used to generate a single PDF for each sales representative. Because Create PDF works in conjunction with Metadata and because it can be used in pass-through mode, in this instance it will only take the relevant PDF pages from the original data file in order to create a single PDF file. Other than the extraction of these pages, the original concatenated data file is untouched.
  • Finally, the output is done using a Send to Folder task in this case. In production this would be a Send Email output, but to avoid sending unwanted emails from this example, the PDF is instead placed in a folder named after the sales representative's email address.
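
The filter, sort, lookup and sequence steps above can be pictured with a short Python sketch. This is not PlanetPress Workflow code and does not use any Workflow API: the dictionaries, field names (country, rep_id, pages), the rep_emails table and the build_sequences function are all hypothetical stand-ins for the Metadata documents, the Metadata fields and the Excel lookup, chosen only to show the logic.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical stand-ins for Metadata documents: one entry per invoice,
# tagged with the fields added at the Document level.
documents = [
    {"customer_id": "C001", "country": "US", "rep_id": "R2", "pages": [1, 2]},
    {"customer_id": "C002", "country": "CA", "rep_id": "R1", "pages": [3]},
    {"customer_id": "C003", "country": "US", "rep_id": "R1", "pages": [4]},
    {"customer_id": "C004", "country": "US", "rep_id": "R2", "pages": [5, 6]},
]

# Hypothetical stand-in for the Excel spreadsheet lookup (Rep ID -> email).
rep_emails = {"R1": "rep1@example.com", "R2": "rep2@example.com"}


def build_sequences(docs, emails):
    """Filter, sort and sequence documents the way the task list describes."""
    # Inclusive filter: keep only the documents for which the condition is true.
    kept = [d for d in docs if d["country"] == "US"]
    # Sort by Rep ID so that each representative's invoices are adjacent.
    kept.sort(key=itemgetter("rep_id"))
    sequences = []
    # Start a new sequence every time the Rep ID changes.
    for rep_id, group in groupby(kept, key=itemgetter("rep_id")):
        pages = [page for doc in group for page in doc["pages"]]
        sequences.append(
            {"rep_id": rep_id, "email": emails.get(rep_id), "pages": pages}
        )
    return sequences


sequences = build_sequences(documents, rep_emails)
for seq in sequences:
    print(seq["email"], seq["pages"])
```

Each resulting sequence corresponds to one output PDF: it lists only the pages of the merged data file that belong to a given sales representative, which mirrors how the output step extracts the relevant pages rather than rebuilding them. Splitting on a change of Rep ID only works because the documents were sorted by that field first.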